Benchmarking seasonality tests

The CHTest for seasonality has shown itself to be... slow. This notebook demonstrates the speed (or lack-thereof) of the old-style CHTest in v1.1.0 vs. later iterations.

Setup

This portion won't change between versions of pmdarima. This dataset was submitted by a user in Issue #12 and showed a very slow performance on the CHTest. Therefore, it's effective for use in benchmarking.



In [1]:

    
import pandas as pd

X = pd.read_csv('item_sales_daily.csv.gz')
y = X['sales'].values
X.head()



In [4]:

    
import pmdarima as pm
import time
from functools import wraps


def timed(func):
    """A decorator to time a result"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        res = func(*args, **kwargs)
        print("Complete in %.3f seconds" % (time.time() - start))
        return res
    return wrapper


@timed
def benchmark(x, test):
    res = pm.arima.nsdiffs(x, m=365, max_D=5, test=test)  # 365 since daily
    print("Version: %s" % pm.__version__)
    return res

Version 1.1.0



In [16]:

    
benchmark(y, "ch")









    



Version: 1.1.0-dev0
Complete in 9.775 seconds

Version 1.2.0



In [4]:

    
benchmark(y, "ch")









    



Version: 1.2.0-dev0
Complete in 9.621 seconds

Version 1.2.0 added the OCSBTest, which is orders of magnitude faster than the CHTest.



In [5]:

    
benchmark(y, "ocsb")









    



Version: 1.2.0-dev0
Complete in 0.012 seconds






    Out[5]:





0



In [ ]:

	date	sales
0	1/1/13	38
1	1/2/13	28
2	1/3/13	46
3	1/4/13	27
4	1/5/13	33